Quality Anomaly Detection Using Predictive Techniques: An Extensive Big Data Quality Framework for Reliable Data Analysis

Abstract

The increasing reliance on Big Data analytics has highlighted the critical role of data quality in ensuring accurate and reliable results. Consequently, organizations aiming to leverage the power of Big Data recognize data quality as a crucial and integral component. One notable type of quality anomaly observed in big datasets is the presence of outlier values. Detecting and addressing these outliers has become a subject of interest across diverse domains, leading to the development of numerous outlier detection approaches. Although outlier detection has witnessed a proliferation of practices in recent years, a significant gap remains in addressing anomalies related to the other aspects of data quality. Indeed, while most approaches focus on identifying anomalies that deviate from expected patterns, they do not consider irregularities in data quality, such as missing, incorrect, or inconsistent data. Moreover, most existing approaches are domain-correlated and lack the capability to detect anomalies in a generic manner. Thus, we aim through this paper to address this gap in the field and provide a holistic and effective solution for quality anomaly detection. To achieve this, we suggest a novel approach that allows a comprehensive detection of quality anomalies related to six quality dimensions: Accuracy, Consistency, Completeness, Conformity, Uniqueness, and Readability. The framework allows a sophisticated implementation of an intelligent anomaly detection model without any correlation to a specific field. Furthermore, we introduce and measure a new metric called “Quality Anomaly Score,” which refers to the degree of anomalousness of the quality anomalies of each quality dimension and of the entire dataset. Through the evaluation of our framework, the suggested framework achieved an accuracy score of up to 99.91% and an F-score of 98.07%.

Similar resources

Big Data Quality: From Content to Context

Over the last 20 years, and particularly with the advent of Big Data and analytics, Data and Information Quality (DIQ) has remained a fast-growing research area. There are many views and streams in DIQ research, generally aiming at improving the effectiveness of decision making in organizations. Although much research has aimed at clarifying the role of Big Data...

From Data Quality to Big Data Quality

This article investigates the evolution of data quality issues from traditional structured data managed in relational databases to Big Data. In particular, the paper examines the nature of the relationship between Data Quality and several research coordinates that are relevant in Big Data, such as the variety of data types, data sources and application domains, focusing on maps, semistructured ...

An Investigation of Performance Analysis of Anomaly Detection Techniques for Big Data in SCADA Systems

Anomaly detection is an important aspect of data mining, where the main objective is to identify anomalous or unusual data in a given dataset. However, there is no formal categorization of application-specific anomaly detection techniques for big data, and this creates confusion for data miners. In this paper, we categorise anomaly detection techniques based on nearest neighbours, cluste...

Anomaly Detection In Cellular Network Data Using Big Data Analytics

Anomaly detection is a key component in which perturbations from normal behavior suggest misconfigured or mismatched data in related systems. In this paper, we present a call detail record based anomaly detection method (CADM) that analyzes the users' calling activities and detects the abnormal behavior of user movements in a real cellular network. CADM is capable of detecting the location o...

Anomaly Detection for Industrial Big Data

As the Industrial Internet of Things (IIoT) grows, systems are increasingly being monitored by arrays of sensors returning time-series data at ever-increasing ‘volume, velocity and variety’ (i.e. Industrial Big Data). An obvious use for these data is real-time systems condition monitoring and prognostic time to failure analysis (remaining useful life, RUL). (e.g. See white papers by Senseye....

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3317354